3D FFT with 2D decomposition

Author

  • Roland Schulz
Abstract

Many scientific applications, including molecular dynamics (MD), require a fast Fourier transform (FFT). As the number of processors in high-performance computers increases, this transform has to be parallelized over a larger number of processors so that it does not become a bottleneck for the parallelization. This requires changing the decomposition from 1D to 2D. Such a 2D-decomposed 3D FFT was implemented in this project. With the 2D decomposition, the limiting factor becomes the required global transpose. This transpose can be sped up by using the FFTW transpose instead of the standard MPI_Alltoall. This 2D-decomposed 3D FFT makes it possible to improve the scaling of the fastest available MD software.
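
The idea of a 2D (pencil) decomposition can be illustrated with a small simulation. With a 1D (slab) decomposition of an n x n x n grid, at most n processes can each hold a whole plane; splitting two axes instead allows up to n^2 processes, at the cost of two global transposes. The sketch below is not the project's implementation: it simulates a pr x pc process grid in a single Python process, uses a naive DFT in place of an FFT library, and replaces the MPI_Alltoall or FFTW transpose with a plain dictionary redistribution. All function names (`dft1d`, `rank_of`, `transform_local_axis`, `transpose`, `pencil_dft3d`) are hypothetical, and pr and pc are assumed to divide n.

```python
import cmath
from itertools import product


def dft1d(values):
    """Naive O(n^2) 1D DFT; a stand-in for a real FFT library call."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def rank_of(point, n, pr, pc, local_axis):
    """Owner (row, col) of a grid point when `local_axis` is kept whole on each rank.

    The two remaining axes are split into pr and pc equal blocks
    (assumption: pr and pc both divide n).
    """
    a, b = [axis for axis in range(3) if axis != local_axis]
    return (point[a] // (n // pr), point[b] // (n // pc))


def transform_local_axis(local, n, axis):
    """Apply a 1D DFT along `axis` to every pencil stored on one simulated rank."""
    others = {tuple(c for i, c in enumerate(p) if i != axis) for p in local}
    for rest in others:
        line = [rest[:axis] + (k,) + rest[axis:] for k in range(n)]
        for p, v in zip(line, dft1d([local[p] for p in line])):
            local[p] = v


def transpose(ranks, n, pr, pc, new_axis):
    """Redistribute data so that `new_axis` becomes local (simulated global transpose)."""
    shared = 1 if new_axis == 1 else 0  # rank coordinate that stays fixed
    new_ranks = {rc: {} for rc in ranks}
    for src, local in ranks.items():
        for p, v in local.items():
            dst = rank_of(p, n, pr, pc, new_axis)
            # Data only moves within one row or one column of the process grid.
            assert src[shared] == dst[shared]
            new_ranks[dst][p] = v
    return new_ranks


def pencil_dft3d(grid, n, pr, pc):
    """3D DFT of grid[(x, y, z)] on a simulated pr x pc process grid."""
    ranks = {rc: {} for rc in product(range(pr), range(pc))}
    for p, v in grid.items():
        ranks[rank_of(p, n, pr, pc, 0)][p] = v  # start with x-pencils
    for axis in range(3):
        if axis > 0:
            ranks = transpose(ranks, n, pr, pc, axis)
        for local in ranks.values():
            transform_local_axis(local, n, axis)
    result = {}
    for local in ranks.values():
        result.update(local)
    return result


if __name__ == "__main__":
    n = 4
    grid = {p: 0.0 for p in product(range(n), repeat=3)}
    grid[(0, 0, 0)] = 1.0  # unit impulse: its spectrum is 1 everywhere
    spectrum = pencil_dft3d(grid, n, pr=2, pc=2)
    print(all(abs(v - 1.0) < 1e-12 for v in spectrum.values()))
```

The assertion inside `transpose` makes the communication pattern explicit: each of the two transposes exchanges data only among the ranks of one column or one row of the process grid, never among all pr * pc ranks at once. In a real MPI code this step is where an all-to-all exchange would take place.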

Related resources

Parallel implementation and scalability analysis of 3D Fast Fourier Transform using 2D domain decomposition

3D FFT is computationally intensive and at the same time requires global or collective communication patterns. The efficient implementation of FFT on extreme scale computers is one of the grand challenges in scientific computing. On parallel computers with a distributed memory, different domain decompositions are possible to scale 3D FFT computation. In this paper, we argue that 2D domain decom...

Parallel Fourier Transformations using shared memory nodes

The Fast Fourier Transform (FFT) is of great importance for various scientific applications used in High Performance Computing (HPC). However, a detailed performance analysis shows that the FFT routines used in these applications prevent them from scaling to large processor counts. The All-to-All type communication required inside these transformation routines, which becomes extremely costly w...

A Portable 3D FFT Package for Distributed-Memory Parallel Architectures

1 Introduction. Multidimensional FFTs are used frequently in engineering and scientific calculations, especially in image processing. Parallel implementations of FFT generally follow two approaches. One is the binary-exchange approach [1,2], where data exchanges take place in all pairs of processors with processor numbers differing by one bit. Another one is the transpose approach[...

PPMC: A Programmable Pattern Based Memory Controller

One of the main challenges in the design of hardware accelerators is the efficient access of data from the external memory. Improving and optimizing the functionality of the memory controller between the external memory and the accelerators is therefore critical. In this paper, we advance toward this goal by proposing PPMC, the Programmable Pattern-based Memory Controller. This controller suppo...

The Comparison 2D and 3D Treatment Planning in Breast Cancer Radiotherapy with Emphasis on Dose Homogeneity and Lung Dose

Introduction: Breast conserving radiotherapy is one of the most common procedures performed in any radiation oncology department. A tangential parallel-opposed pair is usually used for this purpose. This technique is performed using 2D or 3D treatment planning systems. The aim of this study was to compare 2D treatment planning with 3D treatment planning in tangential irradiation in breast conse...

Publication date: 2008